Implementable Failure Detectors in Asynchronous Systems
Abstract
Failure detection is one of the most fundamental modules of any fault-tolerant distributed system. The failure detectors discussed in the literature so far are either impossible to implement in an asynchronous system, or their exact guarantees have not been discussed. We introduce a failure detector called the infinitely often accurate failure detector, which can be implemented in an asynchronous system. We provide one such implementation and show its application to the fault-tolerant server maintenance problem. We also show that some natural timeout-based failure detectors implemented on Unix are not sufficient to guarantee infinitely often accuracy.
Related resources
Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication
We study the problem of achieving reliable communication with quiescent algorithms (i.e., algorithms that eventually stop sending messages) in asynchronous systems with process crashes and lossy links. We first show that it is impossible to solve this problem without failure detectors. We then show how to solve it using a new failure detector, called heartbeat. In contrast to previous failure d...
On Quiescent Reliable Communication
We study the problem of achieving reliable communication with quiescent algorithms (i.e., algorithms that eventually stop sending messages) in asynchronous systems with process crashes and lossy links. We first show that it is impossible to solve this problem in asynchronous systems (with no failure detectors). We then show that, among failure detectors that output lists of suspects, the weakes...
Distributed Predicate Detection in a Faulty Environment
There has been very little research in distributed predicate detection for faulty, asynchronous environments. In this paper we define a class of predicates called set decreasing predicates which can be detected in such an environment. We introduce a set of failure detectors called infinitely often accurate detectors which are implementable in asynchronous systems. Based on these failure detectors...
Fail-Aware Failure Detectors
In existing asynchronous distributed systems it is impossible to implement failure detectors which are perfect, i.e. they only suspect crashed processes and eventually suspect all crashed processes. Some recent research has however proposed that any “reasonable” failure detector for solving the election problem must be perfect. We address this problem by introducing two new classes of fail-awar...
Stubborn Communication Channels
This paper aims at bridging the gap between the assumption of reliable channels by fault-tolerant distributed algorithms and the weak reliability of feasible communication channels. We define a new kind of communication channel, which we call Stubborn channels. Stubborn channels are easily implementable over a connectionless network layer and, although weak, the reliability guarantees offered by ...
Publication date: 1998